Scalable Graph-Mining Techniques with Applications to Systems Biology

Authors

  • Nagiza F. Samatova
  • Matthew C. Schmidt
  • Donald L. Bitzer
  • Jon Doyle
  • Anatoli V. Melechko
  • Wenbin Chen
Abstract

SCHMIDT, MATTHEW C. Scalable Graph-Mining Techniques with Applications to Systems Biology. (Under the direction of Nagiza F. Samatova.) Genetic engineers often seek to modify the genomes of prokaryotic organisms in order to improve their efficiency in certain industrial processes. This requires an understanding of the biological systems responsible for expressing the organism's physical traits, or phenotypes, that the given industrial process requires. This thesis approaches the problem of predicting phenotype-related biological systems by searching for network motifs that model these systems and that occur primarily in the biological networks of phenotype-expressing organisms. The thesis consists of three components that enable this prediction. The first component is a framework that reduces the complexity of identifying phenotype-related metabolic systems through a simplified model of organism-specific metabolic networks and an efficient search method. The second component is a framework that identifies phenotype-related protein functional modules by efficiently searching for a specific subset of cliques in organism-specific functional association networks. Empirical results show that these frameworks identify subgraphs that model highly biologically relevant systems. The third component is a scalable, parallel maximal clique enumeration algorithm that allows applications, such as that of the second component, to enumerate maximal cliques in large-scale biological networks. Runs of this parallel enumeration algorithm on large-scale networks demonstrate that the algorithm's runtime scales linearly to thousands of processors.

© Copyright 2011 by Matthew C. Schmidt
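
To make the notion of maximal clique enumeration concrete, the following is a minimal sequential sketch using the classic Bron-Kerbosch procedure with pivoting. It is not the parallel algorithm developed in the thesis, whose details the abstract does not give; the function name `maximal_cliques`, the adjacency-dictionary representation, and the toy network are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Iterator, Set

# Assumed representation: an undirected graph as a dict mapping each
# node to the set of its neighbours (hypothetical, for illustration).
Graph = Dict[str, Set[str]]


def maximal_cliques(adj: Graph) -> Iterator[FrozenSet[str]]:
    """Yield every maximal clique of `adj` (Bron-Kerbosch with pivoting)."""

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> Iterator[FrozenSet[str]]:
        # r: current clique, p: candidates that can extend r,
        # x: nodes already explored (used to reject non-maximal cliques).
        if not p and not x:
            yield frozenset(r)
            return
        # Pick the pivot with the most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


# Toy association network (made-up data): one triangle plus a short chain.
toy: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}

cliques = set(maximal_cliques(toy))
for clique in sorted(cliques, key=sorted):
    print(sorted(clique))
```

Each recursive call works on its own candidate sets, so the branches form independent subproblems; how the thesis distributes such work across processors is not described in the abstract.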

Similar resources

Data Science for Social Good - 2014 KDD Highlights

As the premier international forum for data science, data mining, knowledge discovery and big data, the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) brings together researchers and practitioners from academia, industry, and government to share their ideas, research results and experiences. Partnered with Bloomberg, it celebrated its 20th year in 2014 with the theme “Data Sc...

The Combinatorial BLAS: design, implementation, and applications

This paper presents a scalable high-performance software library to be used for graph analysis and data mining. Large combinatorial graphs appear in many applications of high-performance computing, including computational biology, informatics, analytics, web search, dynamical systems, and sparse matrix methods. Graph computations are difficult to parallelize using traditional approaches due to ...

Parallel Programming in the Age of Ubiquitous Parallelism

Multicore and manycore processors are now ubiquitous, but parallel programming remains as difficult as it was 30-40 years ago. During this time, our community has explored many promising approaches including functional and dataflow languages, logic programming, and automatic parallelization using program analysis and restructuring, but none of these approaches has succeeded except in a few nich...

Big Graph Mining: Algorithms, Anomaly Detection, and Applications

Graphs are everywhere in our lives: social networks, the World Wide Web, biological networks, and many more. The sizes of real-world graphs are growing at an unprecedented rate, spanning millions and billions of nodes and edges. What are the patterns and anomalies in such massive graphs? How can we design scalable algorithms to find them? How can we make sense of very large graphs? And what kind of rea...

Managing and Mining Graph Data

Graph mining and management has become an important topic of research recently because of numerous applications to a wide variety of data mining problems in computational biology, chemical data analysis, drug discovery and communication networking. Traditional data mining and management algorithms such as clustering, classification, frequent pattern mining and indexing have now been extended to...

Publication date: 2010